-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store Template's mappings as bytes for disk serialization #78746
Store Template's mappings as bytes for disk serialization #78746
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should do this more like we do for mappings (maybe even reuse some code from there) to keep things simpler. No need to do any hacks around reading base64 from JSON.
if (uncompressedMapping.size() > 0) { | ||
builder.field(MAPPINGS.getPreferredName()); | ||
builder.map(reduceMapping(uncompressedMapping)); | ||
if (Metadata.CONTEXT_MODE_GATEWAY.equals(params.param(Metadata.CONTEXT_MODE_PARAM, Metadata.CONTEXT_MODE_API))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if this followed the exact same conventions we use for mappings, respecting the binary
parameter and serializing as bytes for everything but the API
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added binary
parameter and changed usage to similar to IndexMetada
Object compressed = values.get("compressed"); | ||
if (compressed == null) { | ||
return new CompressedXContent(Strings.toString(XContentFactory.jsonBuilder().map(values))); | ||
} else if (compressed instanceof String) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for this if we do things the same way we do them for mappings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This handles situation when you use binary
parameter with JsonXContent
- it stores byte arrays as BASE64 encoded strings. Without it ToAndFromJsonMetadataTests#testSimpleJsonFromAndTo
fails
@@ -88,6 +89,15 @@ private static String differenceBetweenObjectsIgnoringArrayOrder(String path, Ob | |||
} else { | |||
return path + ": first element is null, the second element is not null"; | |||
} | |||
} else if (first instanceof byte[]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably also not necessary to extend things this way if we just do what we do for mappings and have a special binary
param path here I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is required for ESIntegTestCase#ensureClusterStateCanBeReadByNodeTool
to pass - this test uses binary
serialization and this is the first instance where we have to compare byte[]
in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we have that for mappings already, they should be binary serialized as well and appear to run through the exact same code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was different path with parsing IndexMetadata
->bytes->IndexMetada
and ComposableIndexMetadata
->bytes->UnknownMetadataCustom
when using ElasticsearchNodeCommand#namedXContentRegistry
so results differed when using binary
serialization. I changed ElasticsearchNodeCommand
to parse ComposableIndexMetadata
so the code is no longer needed.
@elasticmachine update branch |
@elasticmachine run elasticsearch-ci/part-2 |
@original-brownbear I simplified code and made it similar to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not sure why we have to do anything different than we do with mappings here in the tests and in part of production code. Can you explain what's different here?
new CompressedXContent(Strings.toString(XContentFactory.jsonBuilder().map(p.mapOrdered()))), MAPPINGS); | ||
PARSER.declareField(ConstructingObjectParser.optionalConstructorArg(), (p, c) -> { | ||
XContentParser.Token token = p.currentToken(); | ||
if (token == XContentParser.Token.VALUE_STRING) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do we encode the template like this? This seems broken?
We don't seem to run into a base64 JSON string anywhere for mappings so we shouldn't for templates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shows up with combination of JsonXContent
and binary
parameter. I think we have that only in ToAndFromJsonMetadataTests#testSimpleJsonFromAndTo
. Thing is we don't serialize IndexMetadata
there because Metadata.toXContent
will filter them out if context is not XContentContext.API
so we don't have to serialize mappings for them as well. Test like below would fail exactly because we don't handle VALUE_STRING
case in IndexMetadata
, we just don't test it. It would work with SmileXContent
.
IndexMetadata.Builder indexMetadataBuilder = IndexMetadata.builder("test12")
.settings(settings(Version.CURRENT))
.numberOfShards(1)
.numberOfReplicas(0)
.putMapping("{\"mapping1\":{\"text1\":{\"type\":\"string\"}}}");
Map<String, String> params = new HashMap<>(2);
params.put("binary", "true");
params.put(Metadata.CONTEXT_MODE_PARAM, Metadata.CONTEXT_MODE_GATEWAY);
XContentBuilder builder = JsonXContent.contentBuilder();
builder.startObject();
indexMetadataBuilder.build().toXContent(builder, new ToXContent.MapParams(params));
builder.endObject();
IndexMetadata indexMetadata = IndexMetadata.fromXContent(createParser(builder));
assertNotNull(indexMetadata.mapping());
@@ -88,6 +89,15 @@ private static String differenceBetweenObjectsIgnoringArrayOrder(String path, Ob | |||
} else { | |||
return path + ": first element is null, the second element is not null"; | |||
} | |||
} else if (first instanceof byte[]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we have that for mappings already, they should be binary serialized as well and appear to run through the exact same code?
@elasticmachine update branch |
@elasticmachine update branch |
@elasticmachine update branch |
@elasticmachine update branch |
@elasticmachine update branch |
💔 Backport failed
You can use sqren/backport to manually backport by running |
) This change the way we store mappings in `Template` during serialization to disk - instead storing it as map we use byte array that we already have. This avoids deserialization-serialization cycle during storing cluster state on disk. # Conflicts: # server/src/main/java/org/elasticsearch/cluster/coordination/ElasticsearchNodeCommand.java
…79522) This change the way we store mappings in `Template` during serialization to disk - instead storing it as map we use byte array that we already have. This avoids deserialization-serialization cycle during storing cluster state on disk. # Conflicts: # server/src/main/java/org/elasticsearch/cluster/coordination/ElasticsearchNodeCommand.java
* upstream/master: (24 commits) Implement framework for migrating system indices (elastic#78951) Improve transient settings deprecation message (elastic#79504) Remove getValue and getValues from Field (elastic#79516) Store Template's mappings as bytes for disk serialization (elastic#78746) [ML] Add queue_capacity setting to start deployment API (elastic#79433) [ML] muting rest compat test issue elastic#79518 (elastic#79519) Avoid redundant available indices check (elastic#76540) Re-enable BWC tests TEST Ensure password 14 chars length on Kerberos FIPS tests (elastic#79496) [DOCS] Temporarily remove APM links (elastic#79411) Fix CCSDuelIT for skipped shards (elastic#79490) Add other time accounting in HotThreads (elastic#79392) Add deprecation info API entries for deprecated monitoring settings (elastic#78799) Add note in breaking changes for nameid_format (elastic#77785) Use 'migration' instead of 'upgrade' in GET system feature migration status responses (elastic#79302) Upgrade lucene version 8b68bf60c98 (elastic#79461) Use Strings#EMPTY_ARRAY (elastic#79452) Quicker shared cache file preallocation (elastic#79447) [ML] Removing some code that's obsolete for 8.0 (elastic#79444) Ensure indexing_data CCR requests are compressed (elastic#79413) ...
Pinging @elastic/es-data-management (Team:Data Management) |
This change the way we store mappings in
Template
during serialization to disk - instead storing it as map we use byte array that we already have. This avoids deserialization-serialization cycle during storing cluster state on disk.